HBM-Resident Prefetching for Heterogeneous Memory System
Authors
Abstract
To meet the increasing demands for very large memory capacity, bandwidth, and energy efficiency, researchers are exploring heterogeneous memory systems that combine faster 3D-DRAMs, DDRx DRAM, and non-volatile memories (NVMs). In this paper we evaluate prefetching in a flat-addressable heterogeneous memory comprising High Bandwidth Memory (HBM) and phase change memory (PCM). We find that large prefetch buffers (64 MB) can outperform smaller ones (2 MB); however, it is not feasible to place such large buffers on the processor die. Hence, we evaluate an HBM-resident prefetch buffer that provides larger capacity and takes advantage of HBM’s higher memory bandwidth. We also present new prefetching policies that accommodate the differences in data path compared to traditional prefetchers. We show that reserving a small fraction (1/16th) of HBM to host a hardware prefetch buffer can improve IPC for a set of SPEC CPU2006 and HPC benchmarks by an average of 34% and a maximum of 98% over a baseline system with no prefetching. Prefetching reduces total PCM traffic by 10% on average and shifts more memory traffic to the faster HBM, which improves overall performance. We find that such prefetching outperforms the CAMEO and Alloy cache schemes by 60% and 10% on average, respectively.
Similar resources
Operating System Enhancements for Data-Intensive Server Systems
Recent studies on operating system support for concurrent server systems mostly target CPU-intensive workloads with light disk I/O activities. However, an important class of server systems that access a large amount of disk-resident data, such as the index searching server of large-scale Web search engines, has received limited attention. In this thesis work, we examine operating system techniq...
Job-Speculative Prefetching: Eliminating Page Faults From Context Switches in Time-Shared Systems
When multiple applications have to time-share limited physical memory resources, they can incur significant performance degradation at the beginning of their respective time slices due to page faults. We propose a method to significantly improve memory system and overall performance in time-shared computers using job-speculative prefetching. While a given job or jobs are running, the operating ...
Improving the effectiveness of software prefetching with adaptive executions
The effectiveness of software prefetching for tolerating latency depends mainly on the ability of programmers and/or compilers to: 1) predict in advance the magnitude of the run-time remote memory latency, and 2) insert prefetches at a distance that minimizes stall time without causing cache pollution. Scalable heterogeneous multiprocessors, such as network of computers (NOWs), present special ...
Performance of multiuser network-aware prefetching in heterogeneous wireless systems
We study the performance of multiuser document prefetching in a two-tier heterogeneous wireless system. Mobility-aware prefetching was previously introduced to enhance the experience of a mobile user roaming between heterogeneous wireless access networks. However, an undesirable effect of multiple prefetching users is the potential for system instability due to the racing behavior between the d...
PREFETCHING vs MEMORY SYSTEM
Title of dissertation: Prefetching vs the Memory System: Optimizations for Multi-Core Server Platforms. Sadagopan Srinivasan, Doctor of Philosophy, 2007. Dissertation directed by Professor Bruce Jacob, Department of Electrical and Computer Engineering. This dissertation investigates prefetching schemes for servers with respect to realistic memory systems. A large body of research work has been don...